What are Captions? In a video, captions convey all audio information as text. They include not only spoken content but also non-speech information such as sound effects, music, laughter, and speaker identification and location (for example, dialogue spoken off-screen). Captions appear superimposed over the visual elements in